Disk drive data protection using clusters containing error detection sectors

ABSTRACT

The present invention is related to methods and apparatus that can enhance the reliability of a hard drive by providing a built-in error check in the drive. Conventional hard drives can erroneously seek to an incorrect location on a platter of the hard drive. The erroneous seek corrupts the data stream and is difficult to detect and correct. Embodiments of the present invention can detect a logical block address assigned to a portion of the platter of the hard drive and thereby detect when an erroneous seek has occurred. Upon detection of an error, one embodiment of the present invention can further take corrective action to read from the correct portion of the platter.

RELATED APPLICATION

[0001] This application is a continuation application of U.S. application Ser. No. 09/732,244, entitled “DISK DRIVE DATA PROTECTION USING CLUSTERS CONTAINING ERROR DETECTION SECTORS,” filed Dec. 7, 2000, the entirety of which is incorporated by reference herein.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] The present invention generally relates to mass storage as used in computer systems. In particular, the present invention relates to verifying the integrity of data from a storage device in a computer system.

[0004] 2. Description of the Related Art

[0005] Computer systems frequently store voluminous quantities of data on mass storage devices. A hard drive or a disk drive is one form of mass storage. Popular interface formats that are used for hard drives include the various versions of the small computer system interface (SCSI) and the AT attachment (ATA) interface standards.

[0006] Those in the art have sought to use low cost drives, such as ATA drives, in relatively high reliability applications to save cost. However, an end user of an off-the-shelf hard drive often has no convenient or timely way of identifying whether the drive selected is reliable or unreliable.

[0007] A disk drive typically includes an internal firmware driven controller, which can be prone to firmware bugs. One example of a firmware bug that results in corrupt data is a firmware bug in code that is responsible for caching hard disk data to a memory buffer. In addition, a drive can occasionally seek to an incorrect location on a hard disk platter. For example, when a host computer or controller requests data from logical block address A, the drive can unexpectedly return data from logical block address B instead of logical block address A.

[0008] In a conventional drive, the erroneous seek occurs without warning or indication and the host system is unaware that the drive has erroneously provided data from a wrong location. Although error checking protocols exist, the error checking in a conventional drive is limited to the verification of data transmitted from the drive to a host on an interconnect system, e.g., error checking within an ATA interface. Where the data in the drive is already corrupted by, for example, seeking to the wrong physical location, conventional error checking schemes may fail to detect the error.

[0009] One conventional approach to improve the reliability of a drive embeds error checking information in a non-standard size sector. For example, one conventional approach uses special hard drives that store error checking information in two or more bytes of the non-standard size sector. A standard sector contains 512 bytes. By contrast, the special hard drives store error checking information in the extra two or more bytes of the larger than standard size sectors. A disadvantage to the special drives is a loss in economies of scale, as the special drives differ from standard off-the-shelf drives and are produced in much smaller quantities.

SUMMARY OF THE INVENTION

[0010] Embodiments of the present invention overcome the disadvantages of current systems by providing techniques that allow ordinary disk drives to verify that a seek to a track has been properly commanded by verifying that the desired sector has been accessed. The techniques can apply to single disk drives or to multiple disk drive systems such as in a redundant array of inexpensive disks (RAID). One embodiment maintains a reference to the logical block address of a cluster in an extra sector of the cluster, which allows the embodiment to verify that the seek had been properly executed.

[0011] Other embodiments according to the present invention advantageously maintain an error detection code, such as a Cyclic Redundancy Check (CRC) checksum, in an extra sector of the cluster that can be used to verify the integrity of the remainder of the data in the cluster. In yet another embodiment, both the reference to the logical block address and the error detection code are stored in the extra sector.

[0012] One embodiment of the present invention groups sectors in the disk drive into clusters of sectors. The cluster referenced herein is different than the cluster used in a file allocation table (FAT). The cluster of sectors according to an embodiment of the present invention includes multiple input/output data sectors and at least one “extra” sector. The extra sector maintains error checking information that can be used to verify the data in the data sectors, to verify that a read/write head has performed a seek to the correct track, and the like. The error checking information is recalculated upon extraction of the data from the cluster and compared with the previously stored calculation. In one example, a reference to a logical block address of a sector in the cluster is stored in the extra sector. In another example, the data verification portion of the error checking information conforms to a CRC-CCITT polynomial. The extra sectors occupy a portion of the storage space of the disk drive and the logical block addresses used by a host computer system are translated to new logical block addresses used by the disk drive. A number of sectors requested for transfer can also be translated to compensate for the sectors occupied by the extra sectors.

[0013] According to one embodiment of the present invention, to perform a write operation to the disk drive, the old data from the data group of the cluster on the disk drive is first read, then modified with the new data, and then written to the disk drive. The read-modify-write process allows a computation of the error checking information to be performed quickly and efficiently. A memory buffer can also be used to temporarily store the data to be written to the disk drive.

[0014] In another embodiment of the present invention, an indicator of a location of a cluster of sectors is stored in an extra sector of the disk drive. In one embodiment, the indicator corresponds to the logical block address (LBA) of the first data sector of the cluster of sectors. By maintaining a reference to the physical location of the accessed sector, the embodiment can detect whether the correct sector has been accessed. If an erroneous seek occurred, another seek can be commanded to the hard drive, by, for example, setting an interrupt to a firmware controller. In one embodiment, the other seek can include a command to move the read/write head to other tracks, a command to flush a memory cache, and the like.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] These and other features of the invention will now be described with reference to the drawings summarized below. These drawings and the associated description are provided to illustrate preferred embodiments of the invention, and not to limit the scope of the invention.

[0016] FIG. 1 illustrates a typical sector format.

[0017] FIG. 2 illustrates a sector format according to one embodiment of the present invention.

[0018] FIG. 3 illustrates a conventional hard drive interconnected with a host computer system.

[0019] FIG. 4 illustrates an auxiliary disk controller according to an embodiment of the present invention.

[0020] FIG. 5 is a flowchart of a process, according to an embodiment of the present invention, of writing to a hard drive.

[0021] FIG. 6 is a flowchart of a process, according to an embodiment of the present invention, of reading from a hard drive.

[0022] FIG. 7 consists of FIGS. 7A and 7B, and illustrates a block diagram of a circuit that generates extra sector information.

[0023] FIG. 8 illustrates an embodiment of the present invention incorporated into a controller for a redundant array of inexpensive disks (RAID).

[0024] FIG. 9 illustrates an alternative embodiment of the present invention for a redundant array of inexpensive disks (RAID).

GLOSSARY OF TERMS

Sector: The smallest unit of data that can be accessed in a hard disk. In a conventional drive, a sector holds 512 bytes of data. Some systems do not address an individual sector; rather, the system relates files to clusters.

Cluster: A group or arrangement of sectors.

Checksum: A generic term that refers to a calculation that can be used to verify the integrity of data. Traditionally, a checksum was a simple summation, but the term now encompasses more sophisticated algorithms such as the Cyclic Redundancy Check (CRC).

CRC: A checksum based on a polynomial division of data.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

[0025] Although this invention will be described in terms of certain preferred embodiments, other embodiments that are apparent to those of ordinary skill in the art, including embodiments which do not provide all of the benefits and features set forth herein, are also within the scope of this invention. Accordingly, the scope of the present invention is defined only by reference to the appended claims.

[0026] A disk drive subsystem according to an embodiment of the present invention can advantageously detect when a disk drive has returned erroneous information by seeking to a wrong sector. In one embodiment, the subsystem organizes the hard disk into 33-sector clusters and stores a logical block address (LBA) of the first sector in the cluster and a Cyclic Redundancy Check (CRC) checksum of the first 32 sectors of the cluster in the 33rd sector of the cluster. In response to a read request from a host computer system, the subsystem verifies that a seek to the desired track occurred by reading and comparing the LBA stored in the 33rd sector to an expected LBA. The subsystem can perform subsequent measures to attempt to retrieve the correct information and can further provide a warning to a host computer. Advantageously, the subsystem is compatible with existing standard off-the-shelf disk drives, although in some embodiments, the subsystem can also be incorporated into the disk drive itself.

[0027] A typical disk drive, such as a hard drive, stores data on a platter that is spun at a relatively high rpm by a spindle. In some hard drives, multiple platters are used to increase the capacity of the hard drive. The hard drive stores data on the platter by encoding the data and writing magnetic flux transitions or magnetic flux reversals on the surface of the platter with a head. To retrieve the stored data, the head reads the magnetic flux transitions and the hard drive decodes the transitions to data. Of course, the head can include multiple read and multiple write heads. A servo moves the head to align the head to a track. Control logic, which can include an embedded microcontroller, controls the positioning of the head by the servo. The positioning of the head over a track is one operation that is performed during a seek to data.

[0028] The platter is typically arranged into numerous tracks (or cylinders for multiple platter drives) that can be written to and read by the head. For example, a typical platter has tracks numbering in the tens of thousands. A typical track contains thousands of sectors, each of which, in a standard hard drive, contains 512 bytes of data. In a hard drive using zone bit recording (ZBR), the tracks of the platter are further grouped into zones such that an outer track, with greater circumference than an inner track, contains more sectors than the inner track.

[0029] In some hard drives, a sector is specified by its track and sector number. A sector can also be specified by a logical track and sector number. In many conventional hard drives, each sector of the hard drive is assigned a logical block address (LBA). A host system requests data from the hard drive by specifying an LBA or a set of LBAs, and an embedded microcontroller translates the LBA received from the host to the physical address of the sector on the platter. When the embedded microcontroller erroneously commands a read from a sector that does not correspond to the LBA, the hard drive returns erroneous information.

[0030] FIG. 1 illustrates a typical sector format. A top row 102 indicates the sectors and a bottom row 104 indicates the corresponding LBAs of the sectors above. In a typical disk drive, each sector corresponds to an LBA as shown in FIG. 1. For example, the millionth sector can correspond to the millionth LBA address. In the typical sector format shown in FIG. 1, N corresponds to the highest valid LBA for data. The highest valid LBA is typically determined by the number of valid sectors in the disk drive, and the number of valid sectors in the disk drive is typically less than the highest possible address that could be addressed by the number of address bits that carry the LBA. In one embodiment, the LBA is defined by 64 bits to allow a relatively large number of sectors to be addressed.

[0031] FIG. 2 illustrates a sector format according to one embodiment of the present invention. A top row 202 again indicates the sectors and a bottom row 204 indicates the corresponding LBAs of the sectors as seen from a system external to the embodiment. In the illustrated embodiment, the sectors are grouped into 33-sector clusters. The first 32 sectors of the cluster store input/output (I/O) data, and the 33rd sector of the cluster stores error checking information, such as cyclic redundancy check checksums and LBAs. In the illustrated embodiment, the LBAs for the system are mapped such that the use of every 33rd sector to store error checking information is transparent to the external system.
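The layout of FIG. 2 can be illustrated with a minimal sketch, assuming zero-based addresses and the extra sector placed last in each 33-sector cluster. The function name `host_view` is hypothetical and is not taken from the specification; it returns the LBA that the external system would see for a given sector on the drive, or `None` for an extra sector that is hidden from the external system.

```python
from typing import Optional

DATA_SECTORS = 32                    # I/O data sectors per cluster (illustrated embodiment)
CLUSTER_SECTORS = DATA_SECTORS + 1   # plus one extra sector, assumed to be last


def host_view(drive_lba: int) -> Optional[int]:
    """Return the host-visible LBA for a drive sector, or None for an extra sector."""
    cluster, offset = divmod(drive_lba, CLUSTER_SECTORS)
    if offset == DATA_SECTORS:
        # The 33rd sector of each cluster holds error checking information only.
        return None
    return cluster * DATA_SECTORS + offset


# Print the boundary between the first two clusters.
for lba in (31, 32, 33, 34):
    print(lba, host_view(lba))
```

Under these assumptions, drive sector 32 is skipped, so the external system sees a contiguous address space even though every 33rd sector is reserved.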

[0032] FIG. 3 illustrates a conventional hard drive 302 interconnected with a host computer system 304 by a cable or other interconnect 306. In the example shown, the hard drive 302 and the interconnect 306 conform to an ATA drive specification. The host computer system 304 writes data to and reads data from the hard drive 302 via the interconnect 306. The interconnect 306 carries an address, such as an LBA from the host computer system, which indicates where the hard drive should read or write. The interconnect further carries data corresponding to the address.

[0033] Many operating systems that can execute on the host computer system 304 do not individually access or address each sector of the hard drive. Rather, the operating system groups sectors into clusters so that the operating system does not have to keep track of individual sectors. The grouping into clusters allows the operating system to maintain a relatively small reference table to track the location of a file. The file allocation table (FAT) techniques, as used in the Windows® operating systems from Microsoft Corporation, are examples of techniques that relate files to clusters, and thereby sectors, on the hard disk. For example, the FAT techniques known as FAT16 and FAT32 reference clusters in 16-bit and 28-bit tables, respectively. Typically, from 4 to 64, and from 8 to 64 sectors are grouped into clusters, respectively, for the FAT techniques known as FAT16 and FAT32. Attention is drawn to the fact that the term “cluster” applies generically to a group of sectors and is used herein to designate a group of sectors as defined within a FAT of computer system 304 as well as a group of sectors that is used by a subsystem or disk manager according to an embodiment of the present invention.

[0034] FIG. 4 illustrates a drive manager 400 according to one embodiment of the present invention. The drive manager communicates with the drive 302 and with the host 304 through a drive bus 402 and a host bus 404, respectively. It will be understood by one of ordinary skill in the art that the drive 302 can be a drive from any number of drive types, such as ATA, SCSI, etc. The drive manager 400 includes a read modify write circuit 406, a read circuit 408, a control circuit 410, a buffer 412, an LBA mapper 414, and optionally includes a CRC generator 416. The drive manager 400 may, for example, be embodied within an application specific integrated circuit (ASIC) of a disk array controller system.

[0035] In response to a write request from the host computer 304 to the drive 302, the read modify write circuit 406 accepts data from the host computer 304 and writes the data, together with additional data that enhances the security of the host computer data, to the drive 302. In one embodiment, the read modify write circuit 406 first reads the existing data on the corresponding sectors, modifies the data with the data received from the host computer 304, and writes the data to the drive 302. The buffer 412 can temporarily store data received from the host computer 304 until the data is written to the drive 302. The LBA mapper 414 converts an LBA from the host computer 304 to an LBA that is used by the drive 302. Further details of the LBA mapping performed by the LBA mapper 414 are described later in connection with FIG. 5. The optional CRC generator 416 calculates a checksum-like computation of the data written to the drive 302 so that the integrity of the stored data can be verified at a later time. The CRC that is calculated and stored on the disk is in addition to a CRC calculation that may be computed by a hard disk when the hard disk is transmitting data across an interface. Further details of the CRC generator 416 are described later in connection with FIG. 7.

[0036] In response to a read request from the host computer 304 to the drive 302, the read circuit 408 accepts data from the drive 302 and verifies that the LBA associated with the read request corresponds to the LBA associated with the read data. It will be understood by one of ordinary skill in the art that the LBA stored on the drive 302 can be either the LBA that is referenced by the host system 304, or the LBA used by the drive. If the respective LBAs do not match, the drive manager 400 can initiate a number of tasks in response, such as setting an error flag, interrupting a host processor, and the like, and can further retry reading the data from the drive 302 by commanding another read. Further details of reading data from the drive 302 are described in connection with FIG. 6.

[0037] Embodiments of the present invention can advantageously detect a variety of errors that can be made by a disk drive. Examples of sources of error include where data is incorrectly written to a disk, where data is written to an incorrect location on a disk, where data is incorrectly read from a disk, where data is read from an incorrect location on a disk, and an error in the error detection/correction circuitry, such as a stuck bit in a data line.

[0038] FIG. 5 is a flowchart of an overall process 500, according to an embodiment of the present invention, which can be executed by the drive manager 400 to write to a hard drive. In State 502, the host computer sends a stream of data to be written to the hard drive. To the host computer, the addition of the drive manager 400 is transparent, i.e., no special accommodations need to be made. Typically, the host computer sends an indication of an LBA from which to start writing data, sends an indication of the number of sectors that the data occupies, and sends the data itself. The process advances from State 502 to State 504.

[0039] In State 504, the drive manager 400 translates an LBA used by the system, referred to as a system LBA herein, to an LBA used by the drive, referred to as a drive LBA herein. In State 504, the drive manager 400 also translates the number of sectors to be accessed and written from the number of sectors as seen from the system's point of view to the number of sectors as seen from the drive's point of view.

[0040] In accordance with one embodiment of the present invention, the sectors of the hard drive are grouped or arranged into clusters by the drive manager 400. The cluster referenced herein is not to be confused with the cluster used by a host computer in a FAT. The sectors in the cluster are further grouped into a data group and an error checking group. The data group maintains the data that is normally stored in the drive. The error checking group maintains data that can be used to verify that the data read from the drive came from the correct location on the drive or that the data read from the drive matches with a stored CRC. In one embodiment, the error checking group is one extra sector per cluster.

[0041] In one example, the data group comprises 32 sectors and the error checking group comprises 1 sector. One sector typically contains 512 bytes and is thereby large enough to contain multiple items of error checking and verification data. Although the error checking group can include more than one sector, one sector is preferred to reduce the overhead disk space used to store the verification information. The sectors that comprise the error checking group store verification information, such as logical block address and cyclic redundancy checks, and as a result, the error checking group sectors are no longer available to store normal I/O data, which in turn reduces the effective capacity of the drive. A firmware computation can translate the standard capacity of the drive to the effective capacity such that the host receives an indication of the storage capacity of the drive available to the host. It will be understood by one of ordinary skill in the art that the effective capacity can be further reduced by the reservation of other sectors in the disk drive for other purposes, such as to store configuration data of the drive manager.
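The firmware computation of effective capacity is not spelled out in the text. The following is one possible sketch, assuming that reserved sectors are subtracted first and that a trailing partial cluster is left unused; the function name `effective_capacity_sectors` and its parameters are hypothetical.

```python
def effective_capacity_sectors(raw_sectors: int,
                               data_per_cluster: int = 32,
                               extra_per_cluster: int = 1,
                               reserved_sectors: int = 0) -> int:
    """Sectors reported to the host after removing extra and reserved sectors.

    Assumption: only complete clusters are used; a partial cluster at the
    end of the drive is not exposed to the host.
    """
    cluster_size = data_per_cluster + extra_per_cluster
    usable = raw_sectors - reserved_sectors
    full_clusters = usable // cluster_size
    return full_clusters * data_per_cluster


# A drive with 3,300,000 raw sectors would expose 3,200,000 sectors to the host.
print(effective_capacity_sectors(3_300_000))
```

With 32 data sectors and one extra sector per cluster, the host sees 32/33 of the usable raw capacity under these assumptions.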

[0042] It will be understood by one of ordinary skill in the art that the number of sectors that are clustered into groups by the drive manager 400 is a matter of design choice, and that there are tradeoffs associated with the selection of cluster size. For example, there could be 6 sectors in the cluster or there could be 500 sectors in the cluster. For a relatively low number of sectors per cluster, the proportional amount of space on the drive dedicated to store error checking attributes increases and as a result, a larger percentage of the volume of the hard drive is no longer available to store data. On the other hand, where there are a relatively large number of sectors per cluster, the proportional amount of space on the drive that is no longer available to store I/O data decreases, but more time is consumed to complete a read modify write cycle. The accessing of the cluster of sectors, even where only one sector from the cluster is updated, takes longer as the size of the cluster increases because relatively more sectors are read and written. For example, where the drive manager 400 computes a CRC checksum of the data in a relatively large data group and stores the CRC checksum in an extra sector, a write to a portion of the data group results in a reading of the relatively large data group, a substitution of updated data to the relatively large data group, a recalculation of the CRC checksum, and a writing of the relatively large data group to the disk drive.
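The tradeoff can be made concrete with a small calculation. This sketch assumes one extra sector per cluster and assumes that a single-sector update reads and then rewrites the whole cluster, including the extra sector; both function names are hypothetical.

```python
from fractions import Fraction


def overhead_fraction(data_sectors: int) -> Fraction:
    """Fraction of drive space consumed by the extra sector of each cluster."""
    return Fraction(1, data_sectors + 1)


def sectors_moved_per_single_sector_write(data_sectors: int) -> int:
    """Sectors transferred when one sector is updated (assumed: full read, then full write)."""
    cluster = data_sectors + 1
    return 2 * cluster


for d in (8, 16, 32, 64):
    print(d, float(overhead_fraction(d)), sectors_moved_per_single_sector_write(d))
```

Smaller data groups spend proportionally more space on error checking, while larger data groups move more sectors for each small write.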

[0043] The conversion of a number of sectors and an LBA from the host computer to the drive is expressed below:

$$SN(\text{drive}) = SN(\text{host}) + \frac{SN(\text{host})}{D}$$

$$LBA(\text{drive}) = LBA(\text{host}) + \frac{LBA(\text{host})}{D}$$

[0044] In the formulas expressed above, SN(drive) relates to the number of sectors to be transferred for the drive, and SN(host) relates to the number of sectors to be transferred for the host computer. Similarly, LBA(drive) relates to the logical block address of a sector in the drive (also referred to as a physical LBA), and LBA(host) relates to the logical block address requested by the host computer. D relates to the number of data sectors in a cluster. Additionally, the result of the division is simply truncated and not rounded.
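The two conversions can be sketched with truncating integer division, as described above. The function names are hypothetical, and D defaults to 32 only because that is the value used in the illustrated embodiment.

```python
def host_to_drive_lba(lba_host: int, d: int = 32) -> int:
    """LBA(drive) = LBA(host) + LBA(host) / D, with the quotient truncated."""
    return lba_host + lba_host // d


def host_to_drive_count(sn_host: int, d: int = 32) -> int:
    """SN(drive) = SN(host) + SN(host) / D, with the quotient truncated."""
    return sn_host + sn_host // d


# Host LBA 32 is the first data sector of the second cluster, so it lands
# on drive LBA 33, skipping the extra sector at drive LBA 32.
print(host_to_drive_lba(31), host_to_drive_lba(32))
print(host_to_drive_count(64))
```

Each complete group of D host sectors adds one sector on the drive side, which accounts for the extra sector that closes each cluster.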

[0045] It will be understood by one of ordinary skill in the art that implementation of a division in hardware can require complex circuitry. Thus, the number of sectors in a data group of a cluster, denoted by D above, preferably conforms to a power of 2, e.g., 8, 16, 32, 64, and so on. By conforming to a power of 2, a division by the number of sectors in a data group, D, can be implemented by a simple shift to the right. Preferably, the number of sectors in a data group of a cluster, D, ranges from about 16 sectors to 64 sectors. More preferably, the number of sectors in a data group, D, is 16, 32, or 64.

[0046] Typically, the number of sectors, SN(host), that can be requested for transfer is limited to an 8-bit field (one to 255 sectors). In one embodiment, the number of sectors in the data group, D, is 32, and the number of sectors, SN(host), is further constrained to a multiple of 32, e.g., 32, 64, 96, 128, 160, and so on. SN(host) can thereby be represented by the three most significant bits of the 8-bit field, such as, xyz0 0000(b). The division of SN(host) by D, which in the illustrated embodiment is 32, can be easily accomplished by shifting to the right 5 bits, which is then easily added back to SN(host). The illustrated addition does not actually require the use of an adder circuit because constraining SN(host) to a multiple of D, which is 32 in the illustrated embodiment, clears the five least significant bits of SN(host). The result of the operation is SN(drive)=xyz0 0xyz(b).
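The adder-free computation can be checked with a short sketch. Because the five least significant bits of SN(host) are zero and the shifted value occupies only the three least significant bits, a bitwise OR gives the same result as an addition. The function name is hypothetical.

```python
def sn_drive_without_adder(sn_host: int) -> int:
    """Compute SN(drive) for D = 32 using a shift and an OR instead of an adder."""
    if not 0 < sn_host < 256 or sn_host % 32 != 0:
        raise ValueError("SN(host) must be a nonzero multiple of 32 that fits in 8 bits")
    # xyz0 0000 >> 5 gives 0000 0xyz; no bit positions overlap, so OR equals add.
    return sn_host | (sn_host >> 5)


for sn in range(32, 256, 32):
    print(f"{sn:08b} -> {sn_drive_without_adder(sn):08b}")
```

Every permitted value, from 32 through 224, yields the same result as `sn + sn // 32`.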

[0047] Typically, an LBA for an ATA drive is specified as a 28-bit address. In one embodiment, 64 bits are reserved for the LBA. Any bits that are unused can be set to zero. In the illustrated embodiment, the LBA(drive) is computed by shifting LBA(host) 5 bits to the right, and adding the result back to LBA(host) in an adder. The “extra” sector, which is a 33rd sector in the illustrated embodiment, stores the LBA and the CRC of the preceding 32 sectors. In one embodiment, one LBA is stored in the extra sector and the LBA corresponds to the “physical” LBA of the first sector in the cluster. Alternatively, the LBA stored in the extra sector can be that of any sector in the cluster, or a value derived from the appropriate LBA, rather than the LBA itself. Of course, the “extra” sector can be the first sector in the cluster, rather than the last sector, or any other sector in the cluster.

[0048] It will be understood by one of ordinary skill in the art that the careful selection of the number of data sectors in a cluster and the careful selection of constraints on the number of sectors to transfer can greatly simplify computation. However, it will also be understood by one of ordinary skill in the art that although the use of numbers that are a power of 2, a multiple of 32, etc., can simplify circuit and/or firmware design, other numbers can be used as well.

[0049] The process advances from State 504 to State 506. In State 506, the drive manager 400 relates a sector to which the host system directs a write operation to a sector and a cluster on the drive. The process advances from State 506 to State 508.

[0050] In State 508, the drive manager 400 reads from the selected cluster before completing the write of data to the drive. Preferably, the contents of the selected cluster are read into a buffer. The process optionally advances from State 508 to optional States 510 and 512. In State 510, the drive manager 400 verifies that the stored LBA in the extra sector of the cluster matches with the LBA that is requested (including translation of LBAs as necessary). In one embodiment, only the LBA of the first sector of a cluster is stored in the extra sector, but it will be understood by one of ordinary skill in the art that the LBA of any sector of the cluster, or another value related to the LBA of a sector in the cluster can be stored and read such that the cluster can be identified. In another embodiment, an identifier for the cluster is stored.

[0051] In State 510, the drive manager 400 can also recalculate the error detection code such as the CRC of the stored data and compare the recalculated CRC to the CRC stored in the extra sector. In response to a detected error in the LBA or the CRC, the drive manager 400 can flag an error, set an interrupt, and the like, and can return to State 508 through State 513 to attempt to re-read the affected cluster, etc., as indicated by State 512. It will be understood by one of ordinary skill in the art that a re-read of the cluster can include commands to flush a memory cache in the drive and/or seek to other portions of the drive.

[0052] State 513 limits the number of times that the drive manager 400 attempts to re-read the affected cluster, thereby preventing an infinite loop. When a predetermined number of iterations through States 508, 510, 512 and 513 occurs, the process proceeds from State 513 to State 515. In State 515, the drive manager signals an error, which can provide an indication to the host computer of a failed read.

[0053] Otherwise, the process advances to State 514. In State 514, the drive manager 400 updates the contents of the buffer that correspond to the new sector data indicated by the host computer. In another embodiment, the drive manager 400 updates the corresponding sector in the hard drive directly. Of course, in an embodiment where an error detecting code such as a checksum or CRC is not calculated, the sectors in the cluster that are not updated do not need to be read. The process advances from State 514 to State 516.

[0054] In State 516, the drive manager 400 recalculates the CRC for the data sectors of the cluster so that the CRC corresponds with the new data in the data sectors of the cluster. The process advances from State 516 to State 518. In State 518, the drive manager 400 copies the cluster to the appropriate location in the drive. The entire process 500 or portions thereof can be repeated where multiple sectors are written by the host computer.
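The write process 500 can be sketched in software for a single-sector write, without the retry path of States 512 to 515. The sketch makes several assumptions that are not stated in the specification: the drive is modeled as a Python list of 512-byte sectors, the extra sector stores a little-endian 64-bit drive LBA followed by a 16-bit CRC, and the CRC is computed with Python's `binascii.crc_hqx` (which uses the CRC-CCITT polynomial) with an initial value of 0xFFFF. All names are hypothetical.

```python
import binascii
import struct

SECTOR = 512
D = 32                  # data sectors per cluster
CLUSTER = D + 1         # plus one extra sector, assumed to be last
EXTRA_FORMAT = "<QH"    # assumed layout: 64-bit LBA, 16-bit CRC


class ClusterError(Exception):
    """Raised when the stored LBA or CRC does not match."""


def make_extra(first_lba, data_sectors):
    crc = binascii.crc_hqx(b"".join(data_sectors), 0xFFFF)
    return struct.pack(EXTRA_FORMAT, first_lba, crc).ljust(SECTOR, b"\0")


def format_drive(clusters):
    """Build a zero-filled model drive with valid extra sectors."""
    drive = []
    for c in range(clusters):
        data = [bytes(SECTOR) for _ in range(D)]
        drive.extend(data + [make_extra(c * CLUSTER, data)])
    return drive


def write_sector(drive, host_lba, payload):
    if len(payload) != SECTOR:
        raise ValueError("payload must be exactly one sector")
    drive_lba = host_lba + host_lba // D                  # State 504: translate
    start = drive_lba - drive_lba % CLUSTER               # State 506: find cluster
    cluster = drive[start:start + CLUSTER]                # State 508: read to buffer
    stored_lba, stored_crc = struct.unpack_from(EXTRA_FORMAT, cluster[D])
    actual_crc = binascii.crc_hqx(b"".join(cluster[:D]), 0xFFFF)
    if stored_lba != start or stored_crc != actual_crc:   # State 510: verify
        raise ClusterError(f"verification failed for cluster at {start}")
    cluster[drive_lba - start] = payload                  # State 514: modify
    cluster[D] = make_extra(start, cluster[:D])           # State 516: new CRC
    drive[start:start + CLUSTER] = cluster                # State 518: write back


drive = format_drive(2)
write_sector(drive, 33, b"\x01" * SECTOR)  # host LBA 33 maps to drive LBA 34
```

Reading the whole cluster before writing is what lets the CRC in the extra sector be recomputed over all 32 data sectors even though only one sector changed.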

[0055] FIG. 6 is a flowchart of a corresponding process 600, according to an embodiment of the present invention, of reading from the hard drive. States 602 to 612 of FIG. 6 are similar to States 502 to 512 of FIG. 5. In State 602, the host computer sends a request to read data from a sector in the hard drive. During a read request, the addition of the drive manager 400 disposed between the host computer and the drive is transparent to the host computer. Typically, the host computer requests data from a sector identified by the host computer's LBA, and requests the number of sectors for which the hard drive should supply data.

[0056] The process advances from State 602 to State 604. In State 604, the drive manager 400 translates the host computer system's LBA to the drive LBA as described in connection with State 504 of FIG. 5. Similarly, the drive manager 400 translates the host computer system's requested number of sectors to the drive's number of sectors. The process advances from State 604 to State 606.

[0057] In State 606, the drive relates a sector that is requested by the host computer to a cluster of sectors. In one example where one sector is requested by the host computer and a cluster includes 33 sectors, 32 of which are data sectors and one of which is an “extra” sector, the cluster of sectors corresponds to all 33 sectors in the same cluster as the requested sector. The process advances from State 606 to State 608.

[0058] In State 608, the drive reads the contents of the cluster associated with the requested sector. Preferably, the contents of the cluster are read into a buffer. The process advances from State 608 to State 610. In State 610, the drive manager 400 verifies that the stored LBA retrieved from the extra sector of the cluster matches with the expected LBA. In one embodiment, the stored LBA is the LBA of the first sector in the cluster. The drive manager 400 can optionally verify the integrity of the retrieved data by calculating an error detection code such as a CRC of the data sectors of the cluster and comparing the calculated CRC with the CRC stored in the extra sector of the cluster. The process advances from State 610 to State 612. In State 612, the drive manager 400 can respond to an error in the LBA or CRC by, for example, attempting to re-read the cluster as described in connection with State 512. Where re-reading is selected, State 612 returns to State 608. As described in connection with FIG. 5, the number of times that the process returns to State 608 can be limited to a predetermined, programmed, or selected maximum to avoid an infinite loop.
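The bounded re-read loop can be sketched as follows. The callables `read_cluster` and `verify` stand in for the drive access and for the LBA/CRC check; these names, the exception type, and the default limit of three attempts are assumptions made for illustration only.

```python
class ReadFailed(Exception):
    """Signals that the retry limit was reached without a verified read."""


def read_with_retries(read_cluster, verify, max_attempts=3):
    """Read a cluster, re-reading on verification failure up to max_attempts times."""
    for _ in range(max_attempts):
        cluster = read_cluster()   # read the cluster into a buffer
        if verify(cluster):        # stored LBA and CRC match
            return cluster
        # A real implementation might flush the drive cache or seek elsewhere
        # before the next attempt; that step is omitted from this sketch.
    raise ReadFailed(f"cluster failed verification after {max_attempts} attempts")


# Usage with a simulated drive that returns bad data twice before succeeding.
results = iter(["bad", "bad", "good"])
print(read_with_retries(lambda: next(results), lambda c: c == "good"))
```

Bounding the loop with a fixed attempt count is the simplest way to guarantee termination; the error raised at the end corresponds to signaling a failed read to the host.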

[0059] The process advances from State 612 to State 614. In State 614, the drive manager 400 selects the content of the cluster corresponding to the desired sector and accordingly transfers the selected content of the cluster to the host computer. Of course, the entire process 600 or portions thereof can be repeated where multiple sectors are requested by the host computer.

[0060] FIG. 7 illustrates a block diagram of one embodiment of a circuit 700 according to the present invention that generates extra sector information. The circuit 700 can be implemented in an application specific integrated circuit (ASIC), in a programmable gate array such as a field programmable gate array (FPGA), in a programmable logic device (PLD), or within another type of integrated circuit device. The functions of the circuit 700 could alternatively be implemented within the firmware of a microcontroller.

[0061] The host computer system communicates with a FIFO buffer 702. To the host computer system, the circuit 700 is transparent, i.e., looks substantially similar to a normal disk drive, albeit with reduced memory capacity. A disk drive communicates with the circuit 700 through tri-stateable buffers 704 and input buffers 706 or through transceivers. When the data is written from the FIFO buffer 702 to the disk drive, a read/w̅r̅i̅t̅e̅ control signal 708 is activated as a “low” in the illustrated embodiment. The low state of the read/w̅r̅i̅t̅e̅ control signal 708 enables the outputs of the tri-stateable buffers 704 and selects an output 710 of the FIFO buffer 702 through a first multiplexer 712. It will be understood by one of ordinary skill in the art that the output 710 and an input 714 of the FIFO buffer 702 can be multiplexed as I/O signals and the function of the first multiplexer 712 can be replaced by tri-stateable gates.

[0062] When data is written to the FIFO buffer 702, the read/w̅r̅i̅t̅e̅ control signal 708 is activated as a “high” in the illustrated embodiment. The high state of the read/w̅r̅i̅t̅e̅ control signal 708 disables the tri-stateable buffers 704, thereby avoiding bus contention and allowing the input buffers 706 to read data from the disk drive. The high state of the read/w̅r̅i̅t̅e̅ control signal 708 further controls the first multiplexer 712 so that the first multiplexer 712 selects the input 714 to the FIFO buffer 702. The first multiplexer 712 thereby allows one CRC generation circuit 716 to be used for calculation of both input and output data, i.e., allows reuse of the CRC generation circuit 716 for the initial computation and the verification computation. In one embodiment, the CRC generation circuit 716 computes the CRC using the polynomial known as the CRC-CCITT polynomial.

[0063] As explained in connection with FIG. 5, one embodiment of the present invention reads data from the cluster or clusters associated with an updated sector or sectors and substitutes the updated content in the FIFO buffer 702 before the data is written to the disk drive. When data is written to the disk drive, the data is output from the FIFO buffer 702 through the output 710 and is selected by a second multiplexer 718. The second multiplexer 718 selects the output 710 when an extra sector control signal 720 is not activated. When the extra sector control signal 720 is activated, the second multiplexer 718 selects the input that is coupled to a third multiplexer 721. The third multiplexer 721 provides access to CRC and LBA verification data that is subsequently stored on the disk drive.

[0064] The data then passes through a first and a second write latch 722, 724, and then through the tri-stateable buffers 704 and onto the disk drive. It will be understood by one of ordinary skill in the art that the first and the second write latches 722, 724 can be interspersed in the main data path as shown in the illustrated embodiment or can be fanned out on a separate path. In the illustrated embodiment, the first and second write latches 722 and 724 are configured to capture sequential words so that the sequential words can be compared in a dual-rail comparator 726.

[0065] The dual-rail comparator 726 advantageously allows the illustrated embodiment to detect a broad range of latent defects in the error detection/correction circuitry. Latent defects such as bits stuck at logic “1” or logic “0” can be difficult to detect. For example, where bits are stuck at logic “0,” a checksum of the bits can consequently sum to zero and indicate a valid condition, when in fact a fault exists. By writing both the true and the complementary forms of error checking data, the dual-rail comparator 726 detects such latent defects, thereby permitting the circuit 700 to conform to a “self-checking checker.”

[0066] In addition, the dual-rail comparator 726 can distinguish between errors where the disk drive is at fault and errors where the circuit 700 is at fault. Where the stored LBA is stored in both inverted and non-inverted forms, a further comparison between the two stored forms can reveal a defect in the error checking or error detecting circuitry. For example, where the comparison between inverted and non-inverted forms consistently detects an error in a bit, the circuit 700 can treat the failure as a failure in the error-checking circuitry itself and ignore the failure when accessing the disk drive.

[0067] The illustrated embodiment provides CRC and LBA verification data from the third multiplexer 721 in every other 16-bit word. A one's complement of each word of CRC and LBA data is stored in the 16-bit word locations that are skipped so that a simple compare operation, which can be implemented by Exclusive-OR (XOR) gates, can detect an error in the verification data. A fault signal 728 indicates an error in the verification data. In response to the fault signal 728, the system can log a failure, set an interrupt, re-read data, etc.

[0068] In one embodiment, a set of XOR gates 730 implements the one's complement by inverting the CRC data from the CRC generation circuit 716 and the LBA data from an LBA generation circuit 732 in response to the least significant bit (LSB) of an address of a word counter 734. The LBA generation circuit 732 converts the host computer's LBA to the disk drive's LBA as described in connection with FIG. 4.

[0069] The word counter 734 increments in response to each 16-bit word count, and in the illustrated embodiment, the LSB of the word counter 734 inverts every other word by asserting a logic “1” state as an input to the set of XOR gates 730. It will be understood by one of ordinary skill in the art that there are many ways to invert the CRC and LBA data, such as by coupling selectable inverters to the output of the third multiplexer 721. In one embodiment, the third multiplexer 721 is implemented with multiple tri-stateable gates, which are enabled through decoding logic coupled to the word counter 734.

[0070] When the circuit 700 writes the extra sector containing the CRC and the LBA data to the disk drive, the second multiplexer 718 selects the output of the third multiplexer 721. In one embodiment where a sector contains 512 bytes, the extra sector also stores 512 bytes or 256 16-bit words. Table I, below, illustrates an exemplary memory map within the extra sector that can be used to address the third multiplexer 721.

TABLE I

word address (b)            content
0000 0000                   CRC[15:0]
0000 0001                   CRC[15:0] inverted
0000 0010                   CRC[31:16]
0000 0011                   CRC[31:16] inverted
0000 0100                   CRC[47:32]
0000 0101                   CRC[47:32] inverted
0000 0110                   CRC[63:48]
0000 0111                   CRC[63:48] inverted
0000 1000                   LBA[15:0]
0000 1001                   LBA[15:0] inverted
0000 1010                   LBA[31:16]
0000 1011                   LBA[31:16] inverted
0000 1100                   LBA[47:32]
0000 1101                   LBA[47:32] inverted
0000 1110                   LBA[63:48]
0000 1111                   LBA[63:48] inverted
0001 0000 to 1111 1111      reserved
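The memory map of Table I, together with the inversion controlled by the LSB of the word counter 734, can be modeled in software roughly as follows. This is a sketch only: the function name is hypothetical, and filling the reserved words with zeros is an assumption, since the text does not say what the reserved locations hold.

```python
from typing import List

WORDS_PER_SECTOR = 256   # 512 bytes stored as 16-bit words


def build_extra_sector(crc: int, lba: int) -> List[int]:
    """Lay out 64-bit CRC and LBA fields as in Table I, one 16-bit word per address."""
    words = [0] * WORDS_PER_SECTOR            # reserved words: assumed zero-filled
    for addr in range(16):                    # word addresses 0000 0000 to 0000 1111
        source = crc if addr < 8 else lba     # first eight words CRC, next eight LBA
        shift = 16 * ((addr % 8) // 2)        # selects bits [15:0], [31:16], ...
        raw = (source >> shift) & 0xFFFF
        # The address LSB drives the inversion, as the XOR gates 730 do in hardware:
        # odd addresses hold the one's complement of the preceding word.
        invert_mask = 0xFFFF if addr & 1 else 0x0000
        words[addr] = raw ^ invert_mask
    return words


if __name__ == "__main__":
    sector = build_extra_sector(crc=0x1234, lba=0x0000000100000021)
    print([f"{w:04X}" for w in sector[:16]])
```

Each even word and the odd word that follows it always XOR to 0xFFFF, which is the property a dual-rail comparison relies on.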

[0071] During a read from the disk drive, the data read from the disk drive is latched through a first and a second read latch 736, 738. In another embodiment, the first and the second read latches 736, 738 are fanned out on a separate path rather than interspersed between the disk drive and the FIFO buffer 702 as shown in the illustrated embodiment. The first and the second read latches 736, 738 allow comparison of data read from the disk drive. In one embodiment, alternate words in the extra sector are the one's complement of each other. A comparison of the alternate words is again performed by the dual-rail comparator 726, which can, for example, indicate a fault, log errors, and initiate re-reads in response to a failed match between a word and its complement.
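The comparison of alternate words on a read can be illustrated with a short sketch. The function name and the list-of-words representation are hypothetical; the example also shows why a stuck-at-zero fault that would satisfy a simple additive checksum still fails the dual-rail check.

```python
from typing import List, Sequence


def dual_rail_faults(words: Sequence[int], pairs: int = 8) -> List[int]:
    """Return the even word addresses whose following word is not its one's complement.

    Assumes 16-bit words, with each verification word at an even address and
    its complement at the next odd address.
    """
    faults = []
    for addr in range(0, 2 * pairs, 2):
        # A true/complement pair XORs to all ones; anything else is a fault.
        if (words[addr] ^ words[addr + 1]) != 0xFFFF:
            faults.append(addr)
    return faults


if __name__ == "__main__":
    good = [0x1234, 0xEDCB, 0x0000, 0xFFFF]
    stuck_at_zero = [0x0000, 0x0000, 0x0000, 0x0000]
    print(dual_rail_faults(good, pairs=2))            # no faulty pairs
    print(sum(stuck_at_zero) & 0xFFFF)                # additive checksum looks valid
    print(dual_rail_faults(stuck_at_zero, pairs=2))   # every pair is flagged
```

A non-empty result corresponds to asserting the fault signal 728, after which the system can log the failure or re-read as described.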

[0072] Preferably, a CRC checksum of the data sectors is also stored in the extra sector. Those of ordinary skill in the art recognize the CRC as one of many techniques to validate data that is stored or transmitted. Many versions of the CRC exist, and these versions are commonly referenced by their divisor or “generator polynomial.” In one embodiment, the CRC polynomial selected is known as the CRC-CCITT polynomial. Table II, below, lists several well-known CRC polynomials, though it will be understood that any appropriate polynomial or method can be used to verify the data.

TABLE II

Name          Polynomial
CRC-16        X^16 + X^15 + X^2 + 1
CRC-CCITT     X^16 + X^12 + X^5 + 1
CRC-32        X^32 + X^26 + X^23 + X^22 + X^16 + X^12 + X^11 + X^10 + X^8 + X^7 + X^5 + X^4 + X^2 + X + 1
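For illustration, a bit-serial computation with the CRC-CCITT polynomial of Table II might look like the sketch below. The text names only the polynomial, so the initial register value, the most-significant-bit-first ordering, and the absence of a final XOR are assumptions chosen here to match two commonly used parameter sets; the function name is likewise hypothetical.

```python
CRC_CCITT_POLY = 0x1021   # X^16 + X^12 + X^5 + 1, with the X^16 term implicit


def crc_ccitt(data: bytes, init: int = 0xFFFF) -> int:
    """Compute a 16-bit CRC over data, most significant bit first.

    init=0xFFFF and init=0x0000 are two common conventions; which one (if
    either) a given embodiment uses is not stated in the text.
    """
    crc = init
    for byte in data:
        crc ^= byte << 8                      # bring the next byte into the top bits
        for _ in range(8):
            if crc & 0x8000:                  # top bit set: shift and subtract the divisor
                crc = ((crc << 1) ^ CRC_CCITT_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


if __name__ == "__main__":
    print(hex(crc_ccitt(b"123456789")))           # 0x29b1 with init=0xFFFF
    print(hex(crc_ccitt(b"123456789", init=0)))   # 0x31c3 with init=0x0000
```

The same routine can serve both when the extra sector is generated and when it is verified, mirroring the reuse of the CRC generation circuit 716 for both directions.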

[0073] It will be understood by one of ordinary skill in the art that an embodiment according to the present invention can be implemented in a variety of environments. One embodiment of the present invention can be implemented in a circuit board, such as a disk array controller board, that plugs into an ISA slot or a PCI slot of a standard computer. Another embodiment of the present invention is implemented within a motherboard of a host computer system.

[0074] FIG. 8 illustrates one embodiment of the present invention incorporated into a disk array controller, such as one found in a redundant array of inexpensive disks (RAID). In a disk array system, multiple hard drives are used to provide size, speed, and reliability advantages over a single hard drive. For example, in some RAID systems, the contents of a failed hard drive can be recreated from parity bits. Because multiple hard drives are used in disk arrays, such as RAID systems, the cost of each hard drive impacts the overall cost of the disk array system multiple times. Embodiments of the present invention advantageously allow disk array systems to reliably use less expensive ATA hard drives as well as more costly SCSI hard drives.

[0075] FIG. 8 illustrates a disk array controller 800 interspersed between the host computer 304, which can be a server, and multiple hard drives 802, 804, 806. The disk array controller 800 manages and distributes the storage of information across the multiple hard drives 802, 804, 806. The disk array controller 800 can be physically co-located with the host computer 304, or can be located separately. One embodiment of the present invention is advantageously incorporated into the disk array controller 800.

[0076] A typical disk array controller 800 already incorporates many components, which can be advantageously used to further incorporate an embodiment of the present invention at relatively little additional cost. For example, a typical disk array controller already contains memory buffers, microcontrollers, interfaces to drives, FPGAs, ASICs, PLDs and the like, some of which can be reconfigured to incorporate at least a portion of an embodiment of the present invention. In one embodiment, each automated controller controls a single, respective ATA drive, and includes the disk manager circuit shown in FIG. 7. The controller 800 can be implemented in hardware, such as in an ASIC. Alternatively, the controller 800 is implemented in software, such as within the firmware of a microcontroller.

[0077] The host computer 304 can communicate with the disk array controller 800 through a variety of interfaces, including ATA, SCSI, or Fibre Channel, and the disk array controller 800 can similarly communicate with the hard drives through a variety of interfaces as appropriate for the particular hard drives used to implement the disk array. In one embodiment, the interface from the host computer to the disk array controller 800 differs from the interface used to access the hard drives. The disk array controller 800 may operate as generally described in U.S. Pat. No. 6,098,114, the disclosure of which is hereby incorporated by reference.

[0078] FIG. 9 illustrates an alternative embodiment of the present invention for a RAID. In the embodiment illustrated in FIG. 9, disk managers 400 are coupled between the disk array controller 900 and multiple hard drives 902, 904, and 906.

[0079] Various embodiments of the present invention have been described above. Although this invention has been described with reference to these specific embodiments, the descriptions are intended to be illustrative of the invention and are not intended to be limiting. Various modifications and applications may occur to those skilled in the art without departing from the true spirit and scope of the invention as defined in the appended claims.

What is claimed is:
 1. A method of controlling a disk drive so as to determine whether data returned by the disk drive was read from a correct location, the method comprising: storing data on the disk drive within clusters of sectors such that a cluster includes multiple input/output (I/O) data sectors and an error detection sector, where the error detection sector contains a value indicating a physical location on the disk for the cluster; and in response to a read request from a host, reading a cluster of data from the disk drive, and comparing the value contained in an error detection sector of the cluster to an expected value to determine whether the disk drive accessed data from a correct physical location on the disk drive.
 2. The method as defined in claim 1, wherein the error detection sectors further contain error detection codes for I/O data stored within the respective data sectors, and wherein reading a cluster further comprises determining whether the error detection code within the cluster data is consistent with the I/O data within the cluster data.
 3. The method as defined in claim 2, wherein the error detection code is a CRC code generated from all I/O data stored within the corresponding cluster.
 4. The method as defined in claim 1, wherein the disk drive is an ATA disk drive.
 5. The method as defined in claim 1, wherein each cluster contains exactly one error detection sector.
 6. The method as defined in claim 1, wherein the method is implemented within automated circuitry of a controller device.
 7. The method as defined in claim 1, wherein the method is performed such that the disk drive's hardware and firmware can remain unmodified.
 8. The method as defined in claim 1, wherein the sectors of the cluster belong to a single disk drive.
 9. An error detection system that controls a disk drive to provide an indication as to whether the disk drive accessed a correct location, the system comprising: a store circuit adapted to store data on the disk drive within clusters of sectors, where a cluster includes multiple input/output (I/O) data sectors and an error detection sector, where the error detection sector contains a value that indicates a physical location on the disk that corresponds to a sector from the cluster; and a read circuit adapted to read a cluster of data from the disk drive in response to a read request from a host, and to compare a value retrieved within the cluster of data to an expected value to determine whether the disk drive accessed data from a correct physical location on the disk drive.
 10. The system as defined in claim 9, wherein the store circuit is further adapted to store error detection codes for I/O data stored within the respective data sectors of the error detection sectors, and wherein the read circuit is further adapted to determine whether the error detection code stored within the cluster data is consistent with the I/O data within the cluster data.
 11. The system as defined in claim 10, wherein the error detection code is a CRC code generated from all I/O data stored within the corresponding cluster.
 12. The system as defined in claim 9,wherein the disk drive is an ATA disk drive.
 13. The system as defined in claim 9, wherein each cluster contains exactly one error detection sector.
 14. The system as defined in claim 9, wherein the system is implemented in an application specific integrated circuit (ASIC).
 15. The method as defined in claim 9, wherein the data sectors and the error detection sector of the cluster belong to a single disk drive.
 16. A method of detecting an error in a disk drive, the method comprising: receiving input/output data to be written to the disk drive; writing the input/output data and additional verification data to the disk drive, where the input/output data and the additional verification data are stored in a cluster of the disk drive, where a sector containing the additional verification data is separate from sectors that store the input/output data in the cluster; receiving the input/output data and the additional verification data from the disk drive, where the input/output data and the additional verification data contains an error; and comparing the additional verification data to an expected verification data to detect the error.
 17. The method as defined in claim 16, wherein the sectors of a cluster are within a single disk drive.